Skip to content

Conversation

@iabyn
Copy link
Contributor

@iabyn iabyn commented Jan 7, 2026

Another round of ExtUtils::ParseXS improvements.

This branch:

  • Improves error checking and reporting. Several error messages have been improved, more error messages added etc. The new errors are generally backwards-compatible: they report things which would otherwise have generated broken C code and so failed later in a more confusing way.

  • Test coverage of the warning and error messages is now comprehensive.

  • Refactors the length(foo) pseudo-parameter implementation, adds more error checks, and makes it so that foo no longer needs to be a type that maps to T_PV; merely that it maps to a type whose INPUT template contains SvPV_nolen($arg) or similar.

  • Reorganises the test files and infrastructure. The 7000 line 001-basic.t has now been split into several files, some obsolete files have been deleted, and a new naming system for test files has been introduced:

    0xx-parse-foo.t just run the parser on XS snippets and see if the generated C code matches expectations

    1xx-api-foo.t test ExtUtils::ParseXS API functions

    3xx-run-foo.t parse XS files into C files, compile and run them

    5xx-typemaps-foo.t tests for ExtUtils::Typemaps

  • This set of changes does not require a perldelta entry.

Copy link
Contributor

@jkeenan jkeenan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scope of review so far largely limited to lib/ExtUtils/ParseXS/Node.pm and limited to (i) locating places in test suite where warnings and error messages are or are not being exercised; and (ii) spelling.

@jkeenan
Copy link
Contributor

jkeenan commented Jan 7, 2026

Note: GH web interface is saying "This branch cannot be rebased due to conflicts". However, I created a local branch from the p.r. and was able to rebase it on blead.

Copy link
Contributor

@jkeenan jkeenan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

$ git checkout 195e097f4395c5332a1401201872881cacf7dec8
Note: switching to '195e097f4395c5332a1401201872881cacf7dec8'.
... [skip]
HEAD is now at 195e097f43 ParseXS: rename t/002-more.t to t/302-run-more.t
[perlmonger: perl2] $ configure_and_test
First let's make sure your kit is complete.  Checking...
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

THIS PACKAGE SEEMS TO BE INCOMPLETE.

You have the option of continuing the configuration process, despite the
distinct possibility that your kit is damaged, by typing 'y'es.  If you
do, don't blame me if something goes wrong.  I advise you to type 'n'o
and contact the author (https://github.com/Perl/perl5/issues).

Continue? [n] n
ABORTING...
[perlmonger: perl2] $ ls -l dist/ExtUtils-ParseXS/t/302-more.t
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory
[perlmonger: perl2] $ ack '302-more' .
MANIFEST
4217:dist/ExtUtils-ParseXS/t/302-more.t				Extended ExtUtils::ParseXS compile and run

UU/xdg
16:dist/ExtUtils-ParseXS/t/302-more.t

UU/missing
1:ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

Found during an analysis using Perl5::TestEachCommit. This could potentially cause a problem for Porting/bisect.pl. Can you adjust? Thanks.

@iabyn
Copy link
Contributor Author

iabyn commented Jan 9, 2026

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

Now fixed and rebased.

@Leont
Copy link
Contributor

Leont commented Jan 9, 2026

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

@iabyn
Copy link
Contributor Author

iabyn commented Jan 10, 2026

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

Now done.

@tonycoz
Copy link
Contributor

tonycoz commented Jan 15, 2026

"ParseXS: don't skip death() test" commit message: " the death() method as as it used to exit."

iabyn added 19 commits January 16, 2026 19:50
Add '#': change

    Error: 'else' with no matching 'if' in foo.xs, line 14

to be:

    Error: '#else' with no matching '#if' in foo.xs, line 14

etc.
Make this generic error message be more specific:

    Error: Unterminated '#if/#ifdef/#ifndef' in foo.xs, line 15

State what particular variety of #if wasn't terminated, and what line it
started on:

    Error: Unterminated '#ifndef' from line 12 in foo.xs, line 15
There is already a WarnHint() method which prints a warning message
(adding file/lineno) and then appends a hint message in parentheses.
This commit adds a corresponding deathHint() method.

This method will be used in the next commit.

This commit also changes it so that the hint message is wrapped in a
single set of parentheses, rather than each line being individually
wrapped.

So for example, before:

    Warning: no foo at foo.xs, line 7
        (did you forget to add)
        (a foo at the end of the line)
        (or maybe at the start of the next line?)

after:

    Warning: no foo at foo.xs, line 7
      (did you forget to add
       a foo at the end of the line
       or maybe at the start of the next line?)
Sometimes the XS parser expects the next line *not* to be indented
(usually while parsing file-scoped items).

If it finds an indented line, it used to die with this error message:

    Code is not inside a function (maybe last function was ended by a
    blank line  followed by a statement on column one?) in foo.xs, line 17

That error message was supposed to be a helpful hint for a common XS
programmer error; however, since it also obscures the *actual* error, it
can be just as confusing.

This commit changes it so that the error message makes clear that the
fault is that the current line is indented when an indented line wasn't
expected, but also gives two different hints depending on whether the
indented thing looks like a file-scoped keyword or not. So for example
this XS code:

    #define FOO 1
      PROTOTYPES: DISABLE

now gives this error message

    Error: file-scoped keywords should not be indented in foo.xs, line 13

This is because before dying, the parser now checks whether the bad line
looks a bit like an indented file-scoped keyword. If it doesn't look
like a keyword, such as:

    #define FOO 1
      blah

then it now gives this error message:

    Error: file-scoped directives must not be indented in foo.xs, line 13
      (If this line is supposed to be part of an XSUB rather than being
       file-scoped, then it is possible that your XSUB has a blank line
       followed by a line starting at column 1 which is being misinterpreted
       as the end of the current XSUB.)

which is an alternative wording of the original hint.
Change this warning message from/to:

    Didn't find a 'MODULE ... PACKAGE ... PREFIX' line

    Warning: no MODULE line found in XS file foo.xs

- As it's only a warning, include the 'Warning:' text.

- Include the name of the file.

- It is the lack of a MODULE keyword which is important; whether it
  needs a PACKAGE and PREFIX as well isn't important at this point.
When the parameters in an XSUB's signature are unparseable using the
fancy regex which handles (x = ",", y) etc, it's supposed to print a
warning.

However, the code which emits the warning was calling an unknown Warn()
method and dying instead.

Fix and add a test.

(Ideally the signature in the new test aught actually to be parseable,
but that's an issue for another day.)
Since the recent ParseXS refactoring, an XSUB where the first CASE
keyword wasn't at the start of the XSUB was, in addition to the
expected error, generating spurious additional errors. For example:

    int abc(int x, int y)
      INIT:
        myinit
      CASE: x > 0
        CODE:
          code1;
      CASE:
        CODE:
          code2;

just before this commit was outputting all of:

    Error: 'CASE:' after unconditional 'CASE:' in foo.xs, line 16
    Error: no 'CASE:' at top of function in foo.xs, line 16
    Error: no 'CASE:' at top of function in foo.xs, line 19

and after this commit (and also before refactoring) just:

    Error: no 'CASE:' at top of function in foo.xs, line 16
When the XS parser finds a line starting on column 1 which isn't a
recognised keyword or CPP directive or similar, it assumes that it must
be the start of a new XSUB. If that line is immediately followed by a
blank line, a fairly cryptic error message is emitted. For example,
these two lines:

    int

    BOO:

cause these two errors to be emitted:

    Error: function definition too short 'int' in foo.xs, line 13
    Error: function definition too short 'BOO:' in foo.xs, line 15

After this commit, the first error message is changed to:

    Error: unrecognised line: 'int' in foo.xs, line 13
      (possible start of a truncated XSUB definition?)

and the second to:

    Error: unrecognised keyword 'BOO' in foo.xs, line 15.

This change acknowledges that an unrecognised line followed by a blank
line may not always be a bad XSUB start. In particular, if it looks like
it might be keyword, albeit an unrecognised one, say so. Otherwise,
emphasise that the line can't be parsed rather than just assuming its a
bad XSUB declaration.

In addition, the reporting method has been changed from blurt() to
death(), so that no more parsing takes place. blurt() is best for
semantic errors, where the basic syntax structure is still ok and
parsing can continue. An unrecognised line means something may have gone
badly wrong, so stop.
Only one of ALIAS and INTERFACE should be used per XSUB. Otherwise the
CV's payload is sometimes an integer and sometimes a pointer. So any
generated C code is nonsense. So ban mixing them.
Previously, something like this silently passed:

    INTERFACE: f1 f1

Make it an error.

Also, remove the unused %map variable from Node::INTERFACE::parse().
There are two types of disallowing of default values with a length(foo)
parameter:

    foo(char *s, int length(s) = 0)
    foo(int length(s), char *s = "")

There are are tests only for the first form. This commit adds a test
for the second, rewords it to make it clearer, and does a proper blurt()
rather than a bare die, so proper file and line numbers are shown.

Before:

    default value not supported with length(NAME) supplied at /.../perl-5.43.3.out/lib/5.43.3/ExtUtils/ParseXS/Node.pm line 1416, <__ANONIO__> line 15.

After:

    Error: default value for s not allowed when length(s) also present in foo.xs, line 14
For some reason I managed not to add any during earlier work expanding
the test suite.
In something like

    void
    foo(int i = )

the XS parser previously silently accepted it and generated broken C
code. Now it dies.
In the standard framework for this test file, allow an optional
extra field in the  [ ...] array for each test which, if defined,
marks the test as TODO and becomes the TODO message.

This will be used in the next commit.
Move the various tests for INPUT: section syntax into a common place.
Also modernise many of those tests: many were added before the
test_many() framework was added to simplify XS syntax tests.
INPUT lines can have an optional initialiser, e.g.

    foo(i)
        int i = SvIV(ST($arg));

However if the initialiser was missing, such as:

    foo(i)
        int i = ;

then the XS parser silently succeeded, generating invalid C code.

Add a check and tests for this.

One existing test had to be modified slightly as it was now triggering
this new check; it now passes the new check, but continues to fail on
the later 'invalid parameter declaration' check, which is what it was
supposed to be testing.
Check that any file-scoped keywords appearing in XSUB scope raise an
error.
Keywords like CODE and INIT silently ignore anything on the keyword
line. For example this:

    CODE: aaa
        bbb
        ccc

will add the two lines bbb and ccc to the output C file, and quietly
ignore the aaa text. This commit instead makes it warn (but still
discard).

The new tests both test for the new warning and confirm that the junk is
being ignored.
Clean up the regex which processes OVERLOAD keyword lines. Use //x;
and since most punctuation characters within a [] character class lose
their special regex meaning, remove lots of unnecessary '\'s

Should be no change in functionality.
iabyn added 23 commits January 16, 2026 19:50
Move the test_many() sub and associated stuff out of t/001-basic.t
and into its own module so that it can be shared across multiple test
files. This is a prelude to splitting t/001-basic.t.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with things which can appear at file-scope in an XS file
(i.e. outside an XSUB), apart from file-scoped keywords.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with keywords which can appear at file-scope in an XS file
(i.e. outside an XSUB).

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the declaration of an XSUB (i.e. the first
two lines).

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the individual parameters of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the return type of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the INPUT and OUTPUT keywords of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the  keywords (except INPUT and OUTPUT) of an
XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the C++ support features for XSUBs.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
In the previous commits, most of the body of t/001-basic.t has been
moved out into separate 0xx-parse-foo.t files. Rename this file now
so that it is also a 0xx-parse- file.
Add the standard boilerplate to the top of the file.
This test data file has not been used since v5.15.0-477-g9b58169ac2,
but has been hanging around ever since.
This test file has apparently never actually tested anything, and
there's already another test file, t/104-map_type.t which actually tests
the map_type() method.
This test file doesn't actually test anything.

The set_cond() method is fairly trivial; it just returns a short snippet
of C code like "items < 1" for use in an XSUB's number-of-args checking,
and there's plenty of tests in the 0xx-parse-foo.t test files for that
already.
Each of the 1xx-*.t test files run tests on a particular API function.
So rename all these files to include a -api- prefix (in the same way
that the parsing files are all 0xx-parse-foo.t and the code running are
all 3xx-run-foo.t)
Add a boilerplate comment line explaining the purpose of each file.
The 5xx-t-foo.t files all test ExtUtils::Typemaps (as opposed to all the
other test files in the t/ directory which test ExtUtils::ParseXS).
Rename the filename prefix from -t- to -typemaps- to better signal this
fact.

600-t-compat.t is a bit of an odd one - it's also testing
ExtUtils::Typemaps and I'm not sure why its 600 rather than 5xx. It's
the only 6xx in the directory. I've renamed it to 518-typemaps-compat.t
for consistency.
Add a boilerplate comment line or two explaining the purpose of each
file.
This directory just contains typemap files to be used by various
test files. Make the name give a clearer indication of its purpose.
Fix up a test to work under perl 5.8.9.

This regex:  qr/\Q...\".../ was meant to match the two literal
characters backslash and double-quote, but that only worked correctly
from 5.10.0 onwards.
This private helper module had 'strict' enabled, but not warnings.
So enable them, and fix up a problem it reveals:

the destructor for the tied handle was trying to write the buffer
contents even if the buffer was empty. This is harmless, but was now
triggering a warning for some tests which tested for an error condition
that didn't produce any output, e.g.:

    ../dist/ExtUtils-ParseXS/t/002-parse-file-scope.t ................... 1/?
    print() on unopened filehandle FH at .../CountLines.pm line 44 during global destruction.

This is also a fix for running under 5.8.9, whose test harness rather
anti-socially invokes perl with -w. Which is how I got to know about the
warnings in the first place.
A couple of TODO tests were sort of guessing what C code *should* be
generated, and were getting it wrong. For the first:

- spurious backtick;

- no space after if;

- 1 arg not 2.

For the second:

- no space after if;

Since that C code wasn't being generated yet (hence the TODO tests),
they were happily passing without it being obvious that it was wrong.
Spotted by Tony C.
@iabyn
Copy link
Contributor Author

iabyn commented Jan 16, 2026

Fixed typos in several commit messages: the one Tony pointed out, plus various misspellings of parseXS; fixed up the bad TODO tests; rebased and pushed.

@khwilliamson khwilliamson requested a review from jkeenan January 18, 2026 17:20
[ 0, qr{\Q#line 7 "(input)"\E\n codeline\n#line},
"junk ignored" ],
[ERR, qr{Warning: text after keyword ignored: 'blah'},
"should die" ],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it shouldn't die.

Similarly below.

(introduced in "ParseXS: warn on junk following a CODEish keyword")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests don't seem to test whether the various stderr messages are fatal or not.

for my $op (split ' ', $s) {
if ($op !~ m{^
^
( [\w:"\\)+\-*/%<>.&|^!~{}=]+ )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a new issue - can I overload @{} and ${}?

@tonycoz
Copy link
Contributor

tonycoz commented Jan 19, 2026

commit "ParseXS: add more function pointer type tests" commit message

... type is a function pointpointer type

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants